Semi-supervised K-means clustering algorithm based on active learning priors
CHAI Bianfang, LYU Feng, LI Wenbin, WANG Yao
Journal of Computer Applications    2018, 38 (11): 3139-3143.   DOI: 10.11772/j.issn.1001-9081.2018041251
The Iteration-based Active Semi-Supervised Clustering Framework (IASSCF) is a popular semi-supervised clustering framework, but it has two problems. First, the initial prior information is too scarce, which leads to poor clustering results in the early iterations and degrades the subsequent clustering. Second, in each iteration only the single most informative sample is selected for labeling, so the algorithm runs slowly and its performance improves slowly. To address these problems, a semi-supervised K-means clustering algorithm based on active learning priors was designed, consisting of an initialization phase and an iteration phase. In the initialization phase, representative samples were actively selected to build an initial neighborhood set and a constraint set. Each iteration of the iteration phase includes three steps: 1) Pairwise Constrained K-means (PCK-means) was used to cluster the data based on the current constraints. 2) The most informative unlabeled samples in each cluster were selected based on the clustering results. 3) The selected samples were added to the neighborhood set and the constraint set. The iteration phase ends when the convergence thresholds are reached. The experimental results show that the proposed algorithm runs faster and performs better than the algorithm based on the original IASSCF framework.
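The first two steps of an iteration can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the penalty weight `w`, the farthest-first initialization, and the use of the margin between the two nearest centers as the "informativeness" score are all assumptions made for the example, and only one sample per cluster is queried.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic initialization (an assumption, not from the abstract)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return [list(c) for c in centers]

def violations(i, c, labels, must_link, cannot_link):
    """Count constraints broken if point i were assigned to cluster c."""
    count = 0
    for a, b in must_link:
        if i in (a, b):
            j = b if a == i else a
            if labels[j] not in (-1, c):  # partner already sits in another cluster
                count += 1
    for a, b in cannot_link:
        if i in (a, b):
            j = b if a == i else a
            if labels[j] == c:  # partner already sits in this cluster
                count += 1
    return count

def pck_means(points, k, must_link, cannot_link, w=1.0, iters=20):
    """Simplified PCK-means: k-means cost plus w per violated constraint."""
    centers = farthest_first(points, k)
    labels = [-1] * len(points)
    for _ in range(iters):
        new_labels = list(labels)
        for i, p in enumerate(points):
            costs = [sq_dist(p, centers[c])
                     + w * violations(i, c, new_labels, must_link, cannot_link)
                     for c in range(k)]
            new_labels[i] = costs.index(min(costs))
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def select_queries(points, labels, centers, labeled):
    """Per cluster, pick the unlabeled point with the smallest margin
    between its two nearest centers (assumed uncertainty measure)."""
    best = {}
    for i, p in enumerate(points):
        if i in labeled:
            continue
        d = sorted(sq_dist(p, c) for c in centers)
        margin = d[1] - d[0]
        if labels[i] not in best or margin < best[labels[i]][0]:
            best[labels[i]] = (margin, i)
    return [i for _, i in best.values()]
```

In a full loop, the points returned by `select_queries` would be labeled by an oracle, converted into new must-link and cannot-link pairs, and fed back into `pck_means` until the stopping condition is met.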
Active semi-supervised community detection method based on link model
CHAI Bianfang, WANG Jianling, XU Jiwei, LI Wenbin
Journal of Computer Applications    2017, 37 (11): 3090-3094.   DOI: 10.11772/j.issn.1001-9081.2017.11.3090
Link models can model the community detection problem on networks. Compared with other similar models, including symmetric models and conditional models, the PPL (Popularity and Productivity Link) model handles more types of networks and detects communities more accurately. However, PPL is an unsupervised model and performs poorly when the network structure is unclear. In addition, PPL cannot make use of priors that are easily obtained. To improve its performance using as few priors as possible, an Active Node Prior Learning (ANPL) algorithm was proposed. ANPL selected the pairwise constraints with the highest utility that were easy to label, and automatically generated more informative labeled nodes from the labeled pairwise constraints. Based on the PPL model, a Semi-supervised PPL (SPPL) model was proposed for community detection, which combined the network topology with the node labels learned by the ANPL algorithm. Experiments on synthetic and real networks demonstrate that, using node priors from the ANPL algorithm together with the network topology, the SPPL model outperforms the unsupervised PPL model and popular semi-supervised community detection models based on Non-negative Matrix Factorization (NMF).
Semi-supervised community detection algorithm using active link selection based on iterative framework
CHEN Yiying, CHAI Bianfang, LI Wenbin, HE Yichao, WU Congcong
Journal of Computer Applications    2017, 37 (11): 3085-3089.   DOI: 10.11772/j.issn.1001-9081.2017.11.3085
Semi-supervised community detection methods based on Non-negative Matrix Factorization (NMF) select prior information randomly and therefore need a large amount of supervised information to achieve satisfactory performance. To solve this problem, an Active Link Selection algorithm for semi-supervised community detection based on Graph regularization NMF (ALS_GNMF) was proposed. Firstly, within the iterative framework, the most uncertain and informative links were actively selected as prior information links. Secondly, must-link constraints on these links, which generated the prior matrix, were added to strengthen the connections within a community. At the same time, cannot-link constraints, which modified the adjacency matrix, were added to weaken the connections between communities. Finally, the prior matrix was incorporated as a graph regularization term into the optimization objective function of NMF. Combined with network topology information, this achieved higher community detection accuracy and robustness with less prior information. Experimental results on both synthetic and real networks demonstrate that, at the same prior ratio, the ALS_GNMF algorithm significantly outperforms the existing semi-supervised NMF algorithms in terms of efficiency, and that it is stable, especially on networks with unclear structure.
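How the two kinds of constraints might enter the model can be sketched as below. This is an illustrative reading of the abstract, not the published ALS_GNMF code: the function names are hypothetical, setting a cannot-link entry of the adjacency matrix to zero is an assumed way of "weakening" a connection, and the regularizer shown is the standard graph-Laplacian form tr(H^T L H) with L = D - M, which may differ from the exact term the authors use.

```python
def apply_link_constraints(adjacency, must_link, cannot_link):
    """Return (modified adjacency, prior matrix) for the given constraints.

    Must-link pairs populate a symmetric prior matrix M.
    Cannot-link pairs zero the matching adjacency entries (an assumption).
    """
    n = len(adjacency)
    A = [row[:] for row in adjacency]        # copy so the input stays unchanged
    M = [[0.0] * n for _ in range(n)]
    for i, j in must_link:
        M[i][j] = M[j][i] = 1.0
    for i, j in cannot_link:
        A[i][j] = A[j][i] = 0.0
    return A, M

def graph_regularizer(M, H):
    """0.5 * sum_ij M[i][j] * ||H[i] - H[j]||^2.

    For symmetric M this equals tr(H^T L H) with L = D - M, so it is
    small when must-linked nodes have similar membership rows in H.
    """
    n = len(M)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if M[i][j]:
                diff = sum((a - b) ** 2 for a, b in zip(H[i], H[j]))
                total += 0.5 * M[i][j] * diff
    return total
```

Adding a weighted `graph_regularizer(M, H)` to the NMF reconstruction error of the modified adjacency matrix gives an objective that penalizes factorizations which separate must-linked nodes.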